@eramis73
Contributor

This PR adds Google-style docstrings to public metrics in dspy/evaluate/metrics.py.

  • Ensures correctness and clarity
  • Adds small usage examples
  • Passes pre-commit hooks

resolve #8953
cc @chenmoneygithub

@chenmoneygithub self-requested a review October 20, 2025 17:32
Collaborator

@chenmoneygithub left a comment

Thanks for the PR!


 def EM(prediction, answers_list):  # noqa: N802
-    assert isinstance(answers_list, list)
+    """Return True if any reference exactly matches the prediction (after normalization).
Collaborator

The opening line should describe what this API is, instead of the API's behavior.

otherwise False.
Example:
>>> EM("The Eiffel Tower", ["Eiffel Tower", "Louvre"])
Collaborator

This doesn't render in mkdocs; let's use the fenced code block style, e.g.:

my_code
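
A minimal sketch of what a block-style example could look like inside a Google-style docstring, assuming the docs site renders docstrings as Markdown. The function `exact_match_sketch` and its lowercase-only normalization are hypothetical stand-ins, not the `EM` code from `dspy/evaluate/metrics.py`. The inner fence uses tildes only so that it nests cleanly inside this snippet; Python-Markdown's fenced code extension accepts backtick fences the same way.

```python
def exact_match_sketch(prediction: str, answers_list: list[str]) -> bool:
    """Exact-match metric over a list of reference answers.

    Args:
        prediction: Predicted answer string.
        answers_list: Reference answers to compare against.

    Returns:
        True if any reference equals the prediction after normalization,
        otherwise False.

    Example:
        ~~~python
        exact_match_sketch("paris", ["Paris"])  # True
        ~~~
    """

    # Hypothetical normalization: lowercase and collapse whitespace only.
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return any(normalize(prediction) == normalize(ans) for ans in answers_list)
```

The opening line names what the function is, and the usage example sits in a fenced block instead of a `>>>` prompt, which is the shape the two review comments above ask for.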

>>> EM("paris", ["Paris"])
True
"""
assert isinstance(answers_list, list)
Collaborator

Let's not mix fixes with docstring changes. This assert statement doesn't actually give users any more information.

>>> round(F1("Eiffel Tower is in Paris", ["Paris"]), 2)
0.33
"""
assert isinstance(answers_list, list)
Collaborator

ditto
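
For readers checking the `0.33` in the `F1` example above, here is a rough sketch of token-level F1, assuming SQuAD-style normalization (lowercasing, stripping punctuation and English articles). The helpers `normalize`, `token_f1`, and `max_f1` are hypothetical names for illustration and are not taken from `dspy/evaluate/metrics.py`.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two strings."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection: each shared token counts min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def max_f1(prediction: str, references: list[str]) -> float:
    """Best token-level F1 over all references."""
    return max(token_f1(prediction, ref) for ref in references)


# 5 predicted tokens, 1 reference token, 1 shared token:
# precision = 1/5, recall = 1/1, F1 = 2 * 0.2 * 1.0 / 1.2 = 0.333...
print(round(max_f1("Eiffel Tower is in Paris", ["Paris"]), 2))  # 0.33
```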

float: The highest HotpotQA-style F1 score in [0.0, 1.0].
Example:
>>> HotPotF1("yes", ["no"])
Collaborator

Use a code block here.


 def remove_articles(text):
-    return re.sub(r"\b(a|an|the)\b", " ", text)
+    return re.sub(r"\\b(a|an|the)\\b", " ", text)
Collaborator

do we need to change this?
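
A short sketch of why this change matters, using only the standard `re` module and a made-up sample sentence: inside a raw string, `\b` is a regex word boundary, while `\\b` is a literal backslash followed by `b`, so the doubled form no longer strips articles.

```python
import re

text = "the cat sat on a mat"

# Raw string with single backslashes: \b is a regex word boundary,
# so standalone articles are replaced with a space.
original = re.sub(r"\b(a|an|the)\b", " ", text)

# Raw string with doubled backslashes: \\b matches a literal backslash
# followed by "b", so ordinary text is left untouched.
changed = re.sub(r"\\b(a|an|the)\\b", " ", text)

print(original.split())   # articles removed
print(changed == text)    # True: nothing was replaced
```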

def answer_exact_match(example, pred, trace=None, frac=1.0):
"""Example/Prediction evaluator for answer strings with EM/F1 thresholding.
If ``example.answer`` is a string, compare ``pred.answer`` against it.
Collaborator

nit: use single backticks around variables: `example.answer`



def answer_exact_match(example, pred, trace=None, frac=1.0):
"""Example/Prediction evaluator for answer strings with EM/F1 thresholding.
Collaborator

This is too detailed for the opening sentence.

@eramis73
Contributor Author

All feedback addressed (concise openings + mkdocs block examples)
Non-doc changes reverted
Ready for re-review. Thanks @chenmoneygithub!

Collaborator

@chenmoneygithub left a comment

This is pretty good, thank you for the contribution, LGTM!

@chenmoneygithub merged commit e842ba1 into stanfordnlp:main Oct 27, 2025
10 checks passed
@eramis73
Contributor Author

Thanks!

@Ziems
Collaborator

Ziems commented Oct 28, 2025

This is so great I love these

@eramis73
Contributor Author

Thanks!

hironow added a commit to hironow/dspy that referenced this pull request Oct 30, 2025
commit 056d54e
Author: Isaac Miller <[email protected]>
Date:   Wed Oct 29 17:23:09 2025 +0100

    fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot (stanfordnlp#8909)

    * fix(MIPROv2): zero shot not taking .compile parameters into account before determining if the program was zero shot

    * remove extra logs

    * Remove log

    * Fix merge conflict

    * Remove extra whitespace

commit da69f9d
Author: TomuHirata <[email protected]>
Date:   Wed Oct 29 13:23:34 2025 +0900

    Update anthropic model name (stanfordnlp#8992)

    Signed-off-by: TomuHirata <[email protected]>

commit aaadf05
Author: Chen Qian <[email protected]>
Date:   Tue Oct 28 12:21:55 2025 -0700

    lints (stanfordnlp#8987)

commit e842ba1
Author: eramis73 <[email protected]>
Date:   Tue Oct 28 02:40:34 2025 +0300

    [docs] Add Google-style docstrings for dspy/evaluate/metrics.py (stanfordnlp#8954)

    * docs(metrics): add Google-style docstrings for public metrics

    * docs(metrics): address review feedback (concise openings, mkdocs block examples); revert non-doc changes

    * fixes

    ---------

    Co-authored-by: chenmoneygithub <[email protected]>

commit 6c43880
Author: TomuHirata <[email protected]>
Date:   Tue Oct 28 07:21:06 2025 +0900

    Cache Ollama to speed up CI (stanfordnlp#8972)

    * Cache Ollama to speed up CI

    * fix permission

commit 462baef
Author: Copilot <[email protected]>
Date:   Mon Oct 27 11:57:27 2025 -0700

    Fix TypeError when tracking usage with Anthropic models returning Pydantic objects (stanfordnlp#8978)

    * Initial plan

    * Fix TypeError when merging Anthropic CacheCreation objects in usage tracker

    Co-authored-by: TomeHirata <[email protected]>

    * Enhance _flatten_usage_entry to convert Pydantic models on first add

    Co-authored-by: TomeHirata <[email protected]>

    * Fix potential TypeError when both usage entries are None

    Co-authored-by: TomeHirata <[email protected]>

    * simplify

    * small fix

    * lint

    * robust version handling

    ---------

    Co-authored-by: copilot-swe-agent[bot] <[email protected]>
    Co-authored-by: TomeHirata <[email protected]>
    Co-authored-by: chenmoneygithub <[email protected]>

commit 9b467b5
Author: Noah Ziems <[email protected]>
Date:   Mon Oct 27 13:32:07 2025 -0400

    Add Disable Fallback Option in ChatAdapter (stanfordnlp#8984)

commit bf022c7
Author: Lakshya A Agrawal <[email protected]>
Date:   Sat Oct 25 23:37:42 2025 +0530

    Update gepa[dspy] dependency version to 0.0.18 (stanfordnlp#8969)

    * Update gepa[dspy] dependency version to 0.0.18

    * Update pyproject.toml

    * fix test

    ---------

    Co-authored-by: TomuHirata <[email protected]>
hironow added a commit to hironow/dspy that referenced this pull request Oct 30, 2025
commit 31b96af
Author: Dushmanta <[email protected]>
Date:   Thu Oct 30 13:52:40 2025 +0530

    fix: broken PyPI downloads badge from pepy.tech in README and docs home page (stanfordnlp#8995)

    * fix: update broken pypi download badge in readme

    * fix: update broken pypi download badge in docs home page

Development

Successfully merging this pull request may close these issues.

[docs] Add Google-style docstrings for dspy/evaluate/metrics.py
